Week 3 (June 18-22)

Overview

This week I toiled in my quest to get the Caffe library to work on the lab computer. I had many setbacks and dead ends.

Monday

This day I continued my attempt to implement SqueezeNet by installing a new library (Caffe). One error, first encountered on Thursday, was insurmountable. Other attempts following other tutorials also ran into endless, unfixable errors. Maybe someday I will have moved on from installing Caffe, but that day was not this day.

Tuesday

Caffe Day 2. This day I reached out to my director's colleague's student workers to see if they could help. They offered valuable assistance, but even at the end of this day I could not get Caffe working on the lab's computer.

However, I set up a virtual machine on my laptop running the same OS as the lab's computer. Using that fresh install, I did manage to install Caffe (I did not manage to do anything with it, because my laptop does not have an NVIDIA GPU and therefore cannot run CUDA-reliant programs, which is all we use in the lab). This told me that it is possible to install Caffe. The next day I should have been able to make progress.

Wednesday

Caffe Day 3. This day I did indeed make progress. I eventually did get Caffe to go through all installation steps (make all -> make test -> make runtest -> make pycaffe) but ran into new issues. For one, the lab's method for testing trained models now breaks, freezing the computer for about a minute. Relatedly, after the one successful run of 'make runtest', every later run of it fails with an error that is also GPU-related. I reached out to my director's colleague's student workers again with my new problems to see if they could assist me with these issues as well.
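
For reference, the installation sequence above could be scripted so that it stops at the first target that fails. This is only a minimal sketch, not the procedure actually used in the lab: the function name run_targets and the placeholder directory are assumptions made for illustration, and it assumes the 'make' command is run from the Caffe source directory.

```python
import subprocess

# The four targets, in the order listed in the log entry.
CAFFE_TARGETS = ["all", "test", "runtest", "pycaffe"]


def run_targets(targets, caffe_dir, run=subprocess.run):
    """Run `make <target>` for each target in order; stop at the first failure.

    `run` is injectable so the function can be exercised without a real build.
    Returns the list of targets that completed successfully.
    """
    completed = []
    for target in targets:
        result = run(["make", target], cwd=caffe_dir)
        if result.returncode != 0:
            print(f"make {target} failed with exit code {result.returncode}")
            break
        completed.append(target)
    return completed


# Hypothetical usage; "/path/to/caffe" is a placeholder, not a real path:
# done = run_targets(CAFFE_TARGETS, "/path/to/caffe")
```

Comparing the returned list against CAFFE_TARGETS shows at a glance which step was the first to break on a given attempt.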

Additionally, the Caffe installation cannot have been correct (or something else is wrong in this computer that I am slowly ruining), since the demo script for the program I spent my last three days getting Caffe for does not run. It needs to append to sys.path in-script every time it runs, and even that only lets it get far enough to fail to find some other module.
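
The sys.path problem can be narrowed down with a small check like the one below. It is a sketch under assumptions: the helper names are made up for illustration, and the directory in the usage comment is a placeholder rather than the lab's real path. A common alternative to appending in-script is to add the Caffe python directory to the PYTHONPATH environment variable once.

```python
import importlib.util
import sys


def ensure_on_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)


def missing_modules(names):
    """Return the top-level module names Python cannot locate right now.

    Only top-level names are supported here; dotted names would make
    find_spec import the parent package first.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical usage; the directory is a placeholder:
# ensure_on_path("/path/to/caffe/python")
# print(missing_modules(["caffe", "numpy"]))
```

Running missing_modules before the demo script starts lists every module that cannot be found in one pass, instead of discovering them one failed import at a time.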

Thursday

Caffe Day 4. This day I did successfully get Caffe to fully install on the lab computer and to run the demo script for SqueezeDet. Truly an accomplishment for the ages. I did not yet do further work with SqueezeDet, however, because I was focused on getting the lab's network-testing script back to working. Other people have had this exact error before:

RuntimeError: cuda runtime error (59) : device-side assert triggered at [path/to/file.c:##]

but none of their solutions help me solve my problem. I do not know this code nor this system well enough to effectively solve this issue on my own. We will have seen what I managed.
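
One frequently reported cause of a device-side assert is an index that is out of range once it reaches the GPU, for example a class label outside the range the network expects. Whether that is the cause here is unknown; the sketch below simply assumes the testing script feeds integer class labels to the network, which the log does not confirm. The function name is hypothetical. Setting the CUDA_LAUNCH_BLOCKING environment variable to 1 makes kernel launches synchronous, so the reported location is usually closer to the call that actually failed.

```python
import os

# Must be set before the first CUDA call in the process to have any effect.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def out_of_range_labels(labels, num_classes):
    """Return (position, label) pairs for labels outside [0, num_classes)."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_classes]


# Hypothetical usage on the CPU, before any data is sent to the GPU:
# bad = out_of_range_labels(all_labels, num_classes)
# if bad:
#     print("labels the network cannot index:", bad[:10])
```

An empty result rules out this particular cause and points the search elsewhere.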

I did discover that the testing script from another login on this computer, from my predecessor, still works fine. The plot thickens. What did I do to this poor computer? I checked and my runtime version is (now that I changed it) the same as theirs, so it is not that. Their script is an older version of mine, so it might have something to do with that. I will get to the bottom of this someday.

Friday

This day I followed a new plan: I set up a virtual machine on the lab computer, running the same OS, so that I could do a fresh install on a fresh OS and see whether Caffe and the testing script would work that way. If they did, I could then check the differences between the VM and the host machine to, with hope, find whatever issue I have.

However, as it turns out, virtual machines (at least the ones from VirtualBox) have their own drivers for things like GPUs, which means I could not install CUDA and the related driver on the VM. Since I could not do that, I was not going to be able to run either the testing script or Caffe; this brought an end to this plan.

Afterward, I checked my predecessor's login on the lab computer and discovered an older version of the training and testing scripts we use. I tested them and it turns out they still work. I copied them to my login so that I could work with them another day to determine what about them was different from my newer ones and fix what broke.
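
Finding what differs between the older, working scripts and the newer, broken ones is a job for a line-by-line diff. The sketch below uses Python's standard difflib module; the function names and the file names in the usage comment are placeholders, since the log does not name the actual scripts.

```python
import difflib
from pathlib import Path


def diff_lines(old_lines, new_lines, old_name="old", new_name="new"):
    """Return a unified diff between two lists of lines (empty if identical)."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


def diff_files(old_path, new_path):
    """Read two text files and return their unified diff as a list of lines."""
    old = Path(old_path).read_text().splitlines()
    new = Path(new_path).read_text().splitlines()
    return diff_lines(old, new, str(old_path), str(new_path))


# Hypothetical usage; both file names are placeholders:
# for line in diff_files("old_test_script.py", "new_test_script.py"):
#     print(line)
```

Lines starting with "-" exist only in the older script and lines starting with "+" only in the newer one, which narrows the search to the changes made since the predecessor's version.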